Identification of prognostic hub genes and therapeutic targets for selenium deficiency in chicks model through transcriptome profiling

Selenium deficiency is a prevalent micronutrient deficiency that poses a major health concern worldwide. This study aimed to shed light on the molecular mechanisms underlying selenium deficiency using a chick model. Chickens were divided into control and selenium deficient groups. Plasma samples were collected to measure selenium concentration and transcriptome analyse were performed on oviduct samples. The results showed that selenium deficiency led to a significant reduction in plasma selenium levels and altered the expression of 10,266 differentially expressed genes (DEGs). These DEGs primarily regulated signal transduction and cell motility. The molecular function includes GTPase regulatory activity, and KEGG pathway analysis showed that they were mainly involved in the signal transduction. By using Cytoscape and CancerGeneNet tool, we identified 8 modules and 10 hub genes (FRK, JUN, PTPRC, ACTA2, MST1R, SDC4, SDC1, CXCL12, MX1 and EZR) associated with receptor tyrosine kinase pathway, Wnt and mTOR signaling pathways that may be closely related to cancer. These hub genes could be served as precise diagnostic and prognostic candidate biomarkers of selenium deficiency and potential targets for treatment strategies in both animals and humans. This study sheds light on the molecular basis of selenium deficiency and its potential impact on public health.

Selenium (Se) is an essential trace element that plays critical roles in various metabolic processes, reproduction, antioxidant defence, and immune system of animals and humans [1][2][3] . The first nutritional role of selenium (Se) was recognized in livestock species in 1950, which showed that Se prevented pathologies in vitamin E-deficient animals 4 . Several reports suggest that dietary selenium supplementation can prevent certain forms of chronic diseases, such as cardiovascular diseases, diabetes, and cancer 5 . Plasma Se content is considered a useful biomarker of both Se status and dietary intake, and deficient selenium status has been associated with increased risk of diseases such as cardiovascular diseases, cancer, and viral infections like COVID-19 (coronavirus disease 2019), and HIV (human immunodeficiency virus) 6,7 . Previous studies have demonstrated that selenium deficiency can lead to impaired growth, which in turn can result in the development of pancreatic atrophy, exudative diathesis, and muscular dystrophy 8 . These conditions are often linked together and are caused by various types of lesions 8 . Pancreatic atrophy causes degenerative changes in the pancreas, exudative diathesis results in increased capillary fragility and haemorrhage, while muscular dystrophy leads to muscle wasting and oxidative stress 9,10 . Furthermore, dietary Se deficiency has also been found to be deleterious effects on egg production, and egg weight in laying hens and turkeys 11 . Functional analysis of differentially expressed genes (DEGs). The g: Profiler web server, a public functional annotation tool, was utilized to analyse the up and downregulated genes in chickens and gain insights into various Gene Ontology (GO) terms 19 . The official upregulated gene symbols were uploaded to the webserver, and G. gallus was chosen as the reference genome, resulting in 194 annotated genes across three GO terms: biological process (BP), cellular component (CC), and molecular function (MF) (Fig. 4a). Similarly, 146 downregulated genes were annotated across the three GO terms (Fig. 4b).
Further GO analysis was conducted on 213 upregulated and 237 downregulated differentially expressed genes (DEGs), assigning them to 90 enriched GO terms. The upregulated DEGs were annotated to 30 biological processes, two molecular functions, and 30 cellular components, with significant terms such as "cellular component movement, " "regulation of cell motility, " "regulation of signalling, " "regulation of cell migration, " "positive regulation of chemotaxis, " "regulation of signal transduction, " "regulation of locomotion, " and "cell communication. " Cellular component GO terms included "endoplasmic reticulum, " "endomembrane system, " "membrane, " and "vesicle. " Molecular functions enriched with "protein binding" and "protein-containing complex binding" were identified.
The downregulated differentially expressed genes (DEGs) were assigned to 30 BP (biological process), 30 MF (molecular function), and 30 CC (cellular component) Gene Ontology (GO) terms, with significant terms such as "cell surface receptor signalling pathway, " "signal transduction, " "cell adhesion, " "cell communication, " and "cell cycle. " The most relevant terms for molecular function were "binding, " "GTPase regulatory activity, " "inorganic cation transmembrane transporter activity, " and "carbohydrate binding. " Cellular component GO terms included "actin bundle filaments, "actin cytoskeleton, " and "phagocytic vesicle membrane. " All the identified Gene Ontology (GO) terms for both upregulated and downregulated genes are summarized in Tables 1 and 2, respectively. Pathway enrichment analysis of up-and downregulated genes of differentially expressed genes (DEGs) in chickens was conducted using the Kyoto encyclopedia of genes and genomes (KEGG) database. The study  www.nature.com/scientificreports/ identified 10, 266 differentially expressed genes (DEGs) in the control versus selenium deficient groups, which were classified into 23 functional pathway categories in the Kyoto encyclopedia of genes and genomes (KEGG) database. These pathways were further grouped into five specific categories in the Kyoto encyclopedia of genes and genomes (KEGG) database. These pathways were further grouped into five specific categories, namely, metabolism (A), genetic information processing (B), environmental information processing (C), cellular processes (D), and organismal system (E), as shown in Table 3.
Within the environmental information processing category, the most abundant subcategory was "signal transduction, " with a gene count of 1280. In the cellular processes category, the predominant group was "transport and catabolism, " with a gene count of 482. The genetic information processing category had the largest subgroup called "translation, " with a gene count of 312. In the metabolism category, two categories were more common than others namely "lipid metabolism" (gene count: 260) and "carbohydrate metabolism" (gene count: 208). Last, the organismal systems category had the most abundant group called "environmental adaptation" with a gene count of 222. Furthermore, the Kyoto encyclopedia of genes and genomes (KEGG) annotation statistics are provided in Supplementary Table S11. Identification of module and hub genes through protein _ protein interaction (PPI) network analysis. According to the analysis of the STRING (Search Tool for the Retrieval of Interacting Genes/Proteins) database, the protein _ protein interaction (PPI) network of 324 DEGs was observed to have 324 nodes and 439 edges after applying appropriate filters. The expected number of edges was 305, and the average node degree was 2.71, indicating that each node had at least 2.71 interacting nodes. The average local clustering coefficient was 0.373, and the protein _ protein interaction (PPI) enrichment value was 3.56e −13 (Fig. 5a).
To investigate the upregulated and downregulated differentially expressed genes (DEGs), we imported all 324 DEGs into the STRING database and visualized the network using Cytoscape App software. In the resulting network, yellow nodes represented upregulated DEGs, and green nodes represented downregulated DEGs (Fig. 5b).
To identify functional modules within the network, we performed a functional module analysis using the Molecular Complex Detection (MCODE) plug-in in Cytoscape software. The analysis revealed a total of 8 www.nature.com/scientificreports/  www.nature.com/scientificreports/  www.nature.com/scientificreports/ functional modules ( Fig. 6a-h). The top-scoring module was Module 1, which consisted of 7 nodes and 21 edges with a score of 7 (Fig. 6a). Module 2 included 6 nodes and 12 edges with a score of 4.8 (Fig. 6b), and Module 3 consisted of 19 nodes and 35 edges with a score of 3.889 (Fig. 6c). Module 4 included ANKRD2, CSRP2, and FHL2, and had 3 nodes and 3 edges with a score of 3 (Fig. 6d). Module 5 contained SEC16B, ETV5, and TMEM18, and had 3 nodes and 3 edges with a score of 3 (Fig. 6e). Module 6 had ERO1L, CLGN, and TXNDC5, and had 3 nodes and 3 edges with a score of 3 (Fig. 6f). Module 7 included CLDN10, CLDN1, and OCLN, and had 3 nodes and 3 edges with a score of 3. Module 8 also had 3 nodes and 3 edges with a score of 3 ( Fig. 6g and h). The detailed results of the functional module analysis are provided in Supplementary Tables S12-S20. The hub genes, which are selected based on their high scores in the degree algorithm ranking in CytoHubba (Fig. 7a), are comprised of FRK (fyn-related Src family tyrosine kinase), JUN (jun proto-oncogene), PTPRC (protein tyrosine phosphatase receptor type C), ACTA2 (actin alpha 2), MST1R (macrophage-stimulating 1 receptor), SDC4 (syndecan 4), SDC1 (syndecan 1), CXCL12, MX1 (MX dynamin like GTPase 1), and EZR (ezrin), as shown in Fig. 7b.
Identification of cancer-related gene networks using hub genes. Subsequently, these ten hub genes underwent CancerGeneNet network analysis, resulting in three different types of graphs. In Fig. 8a, a direct connection was observed between the query hub genes at Level 1. Figure 8b shows that the hub cancer genes were connected and interacted with two cancer genes, forming the first neighbours at Level 2. Figure 8c displays the protein-to-protein (PPI) interactions, which were related to external stimuli, including angiogenesis, metastasis,  www.nature.com/scientificreports/ apoptosis, monocyte differentiation, cell migration, brown adipogenesis, and proliferation, at Level 3. Furthermore, Table 4 provides information on the enrichment of hub genes involved in cancer pathways.

Discussion
Recently, dietary selenium-deficiency has been associated with many diseases in animals and humans 20 . Therefore, we have taken stepwise approaches to verify and support our findings. First, we fed chicks a low-Se diet for 10 weeks, and then, we observed that the Se content in the blood samples was significantly lower in the   www.nature.com/scientificreports/ selenium-deficient group than in the control group. These results indicate that the selenium-deficient model of chicks has been established successfully. Consistent with our findings, previous studies have demonstrated that lower dietary selenium induced severe selenium-deficiency in chicks 21 . Second, we performed gene expression profiling using RNA-Seq of oviduct samples from the control and selenium-deficient groups. We identified the maximum number of upregulated DEGs (210) and downregulated DEGs (230). The upregulated genes included AvBD12 (avian beta-defensin 12), ATP2A2 (ATPase sarcoplasmic/ endoplasmic reticulum Ca 2+ transporting 2), CXCL-12 (cxc motif chemokine ligand 12), POF1B (premature ovarian failure protein 1B), SARAF (store-operated calcium entry associated regulatory factor), SLC12A2 (solute carrier family 12 member 2), TRAIL-like (tumour necrosis factor-related apoptosis inducing ligand), and TXNDC5 (thioredoxin domain containing 5). Downregulated genes such as AVBD (avian beta-defensin 9), and CABP2 (calcium binding protein 2), were activated under selenium deficiency conditions. These genes are involved in various functions such as chemoattractant for avian immune cells 22 , calcium transporters 14 , chemokine members 23 , premature ovarian failure 24 , store-operated calcium entry associated regulatory factor 25 , solute carrier protein 13 , TNF-related apoptosis inducing ligand-like protein 26 , thioredoxin domains 27 , and calcium binding protein similar to calmodulin 28 .
Third, GO annotation and KEGG analyses were used to further elucidate the biological roles of the DEGs during the dietary selenium-deficiency response. Fourth, we constructed a protein _ protein interaction (PPI) network using the STRING database and detected 8 functional modules and 10 hub genes using the CytoScape software plugins cytoHubba and MCODE. Finally, enrichment analysis of hub genes was performed using Can-cerGeneNet software. These findings indicated that the hub genes could be potential prognostic biomarkers or therapeutic targets for selenium-deficiency in animal and human models.
In this study, a total of 324 DEGs, 194 upregulated genes and 140 downregulated genes, were identified. Next, the 194 upregulated DEGs were subjected to GO and KEGG pathway enrichment analyses. The biological process (BP) GO terms showed that genes were related to the regulation of cell motility, cell migration, positive regulation of chemotaxis, regulation of signal transduction, cell communication, signalling, locomotion, and cellular component movement. The cellular component (CC) GO terms showed that the genes were associated with the endoplasmic reticulum, endomembrane system, membrane and vesicle. The molecular function (MF) GO showed that the genes were related to protein-containing complex binding and protein binding activity. These ontologies are represented by genes involved in the impaired immune system and some genes related to tumorigenesis. Previous studies showed that selenium modulates intracellular signalling and cell growth 29 .
Furthermore, significant DEGs between the control and selenium-deficient samples were mapped to reference canonical pathways in the KEGG database. A total of 10,266 of the control and selenium-deficient groups were categorized into 23 major KEGG pathway and were classified into five larger pathway categories: metabolism, genetic information processing, environmental information processing, cellular processes, and organismal systems ( Table 3). The highest number of gene-associated categories was assigned to the environmental information processing category, followed by metabolism, cellular process, and genetic information processing, whereas the lowest gene count was related to the organismal systems category. In terms of signal transduction, many pathways such as MAPK (mitogen-activated protein kinase), ErbB (erythroblastic oncogene B), Ras (rat sarcoma virus), Wnt (wingless-related integration site), Rap1 (ras-related protein 1), Notch, TGF-beta (transforming growth factor beta), Hippo, Apelin, NF-kappa B (nuclear factor-kB), PI3K-Akt (phosphoinositide-3-kinase-protein kinase B/ www.nature.com/scientificreports/ Ak strain transforming), cAMP (cyclic adenosine monophosphate), and phospholipase D indicate a large amount of signal generation during selenium deficiency conditions. Earlier studies revealed that selenium deficiency caused increased cancer risk 30 .
In our study, we identified hub genes according to the degree algorithm score in cytoHubba. The top ten highest-scored genes were selected as the hub genes: FRK (fyn-related src family tyrosine kinase), JUN (jun proto-oncogene), PTPRC (protein tyrosine phosphatase receptor type C), ACTAC2 (smooth muscle actin alpha 2), MST1R (macrophage-stimulating 1 receptor), SDC4 (syndecan 4), SDC1 (syndecan 1), CXCL12 (c-x-c motif chemokine ligand 12), MX1 (MX dynamin like GTPase 1) and EZR (ezrin). FRK, the top hub gene, encodes a Fyn-related Src family tyrosine kinase 31 . Previous studies have reported that the FRK gene plays an important role during oncogenesis and progression and is regulated in a tissue specific manner 31 .
The second hub gene was JUN, which and belongs to the basic region leucine zipper family, and constitutes activator protein-1 (AP-1) thereby regulating transcription and cellular activities such as proliferation, tissue morphogenesis, apoptosis, and tumorigenesis 32 . PTPRC (protein tyrosine phosphatase receptor type C), the third hub gene, encodes a member of the protein tyrosine phosphatase (PTP) family 33 . PTPRC is also known as CD45 and it functions as a signalling gatekeeper in T cell 34 . It has been reported that PTPRC is related to multiple kinds of malignancies, such as gastric cancer and breast cancer 33 The ACTA2 gene encodes smooth muscle specific α-actin, a component of the contractile apparatus of vascular smooth muscle cells 34 . Lee reported that ACTA2, which is involved in mechanical tension, cell movement and shape resulted in dynamics of cytoskeletal structures for invasion and metastasis in tumors 35 . The MST1R gene encodes a protein that functions as a tyrosine kinase receptor for macrophage-stimulating protein 36 . Wang reported that MST1R plays an important role in host defence including tissue pairing and inhibition of inflammation induced by pathogens 37 . Sakamoto et al. 38 showed that higher MST1R expression led to increased ciliary motility which prevented chronic infection.
SDC4 (syndecan-4) is an important hub gene. It is ubiquitously expressed, and a transmembrane proteoglycan bearing heparan sulfate chains 39 . Keller-Pinter suggested that SDC4 (syndecan-4), plays several roles in growth factor binding, small GTPase Rac1 activity, intracellular calcium-level regulation, extracellular matrix component and focal adhesion kinase phosphorylation regulation 40 . SDC4 (syndecan-4) impaired function causes the development of breast cancer, prostate cancer and many other cancers 41,42 . The SDC1 gene (encodes the protein of syndecan-1) is an evolutionarily conserved type I transmembrane protein family and lack a common molecular structure 43 . Several studies have shown that the expression of syndecan-1 is different in different cancer types 44 .
The CXCL12 gene, also known as stromal cell-derived factor-1 (SDF-1) is a crucial chemokine that is involved in embryogenesis, haematopoiesis, lymphopoiesis, angiogenesis, and inflammation 45 . Previous studies have shown that CXCL12 is related to various cancers, including pancreatic cancer, colorectal cancer, breast cancer, and cervical cancer [46][47][48] . Myxovirus resistance protein 1 (MX1) is a GTPase that inhibits the multiplication replication of many RNA viruses and is a downstream target for the type I IFN pathway of the innate response 49 . Racicot found that MX1 (Myxovirus resistance protein 1) is an essential component of exosomes secreted by uterine epithelial cells in avian 50 . Another study by Feng et al. 51 suggested that Myxovirus resistance protein 1 is an important marker for organ damage.
EZR (encodes the ezrin protein) is another important hub gene and is a member of the ezrin-radixin-moesin (ERM) protein family 52 . Li, reported that ezrin is mainly expressed on top of the cell surface and maintains the polarity of epithelial cells 53 . Recent studies have found that ezrin participates in the invasion and metastasis of cancer cells 54 .
Furthermore, we used the CancerGeneNet network tool 55 , and we identified that enrichment analysis of the cancer module yielded more hits to AML1-ETO in AML (acute myeloid leukaemia). These studies demonstrate that these 10 hub genes are correlated with different cancers and are consistent with our results, which predicted that they have the potential to become cancer biomarkers. Hence, the top 10 hub genes were related to cancer pathways.
In summary, we first compared the expression of all genes between selenium-deficient and control oviduct samples of chicks by RNA-Seq and found significant changes in selenium status linked with various genes involved in selenium-deficiency. Subsequently, we also performed integrative bioinformatic analysis to identify the key genes, and potential pathways involved in the dietary selenium response. In this context, FRK (fynrelated Src family tyrosine kinase), JUN (jun proto-oncogene), PTPRC (protein tyrosine phosphatase receptor type C), ACTAC2 (smooth muscle actin alpha 2), MST1R (macrophage-stimulating 1 receptor), SDC4 (syndecan 4), SDC1 (syndecan 1), CXCL12 (c-x-c motif chemokine ligand 12), MX1 (MX dynamin like GTPase 1) and EZR (ezrin) were identified and verified as hub genes. These hub genes are related to different cancer pathways. Hence, our study provides a strong basis for future cancer and selenium-deficiency targeted therapies, and these 10 hub genes could potentially be new selenium-deficiency gene targeted therapies. Further studies are needed to confirm our findings to determine the exact mechanism behind selenium deficiency in normal humans and their related complications.

Methods
Statement. The study followed the guidelines of ARRIVE.

Birds.
A total of thirty-six, 16-week-old white leghorn chickens with an average initial weight of 55.8 g were used. All birds were purchased from Shree Sai Krishna Commercial Poultry Farm, Namakkal, Tamil Nadu, India.
Experimental design. Chickens were randomly allotted into two dietary treatment groups, the control (basal diet) and selenium-deficient diet groups. Each group consisted of 6 birds with three replicates in 18 differ- www.nature.com/scientificreports/ ent cages (two birds per cage). Each cage was equipped with one nipple cup, and one feeder and the cage size was H40 × W40 × D40 cm. All birds were offered free access to water and an experimental diet ad libitum, followed by a 14 L:10 D lighting schedule at an intensity of 40 lx. The common infectious diseases were prevented prophylactically by administering Clostin sulphate, Neomycin sulphate and spectinomycin in the drinking water 56 . Throughout the study, daily observations were conducted to record the overall health of the subjects, as well as any clinical symptoms related to selenium deficiency diseases and mortality and detailed incidents of ED based on gross appearance 57 . All birds are safeguarded against viral infection administrating the Newcastle disease virus (NDV) vaccine through both the intraocular and intranasal routes 58 .
Establishment of a dietary selenium-deficient chick model. A basal diet was formulated for chicks, utilizing corn-soya bean meal and adhering to the National Research Council (NRC, 1994) guidelines 59 for their nutritional requirements, as demonstrated in Table 5. The chicks in the control group were fed with basal diet alone for 10 weeks, which contained 0.012 mg/kg of selenium. In contrast, the chicks in the selenium deficient www.nature.com/scientificreports/ group were given a diet with a selenium content of only 0.002 mg/kg for 10 weeks, with the purpose of inducing state of selenium deficiency in this group, as illustrated in Statistical analysis. The statistical analysis was performed by IBM SPSS 20.0 Software (SPSS Inc., Chicago, IL, United States). Statistical significance between groups was determined by one-way ANOVA followed by a post hoc Tukey HSD comparison test to identify differences in means among dietary treatments for selenium concentration in blood samples. *Asterisk indicates statistical significance (p < 0.05).
Sample collection. Blood samples were taken from the main wing vein and collected into an anticoagulant tube after 10 weeks during the whole feeding period. Plasma was separated by centrifugation at 4 °C, and 3000 rpm for 10 min and stored at − 30 °C for further analysis. After dietary treatments for 10 weeks, 3 randomly chosen chickens from each group were slaughtered 15-20 h post-ovulation. The oviduct was sampled and frozen in liquid nitrogen immediately. All samples were stored at − 80 °C prior to further analysis for RNA-Seq.
Determination of selenium content. The selenium content in the diet and blood samples was measured according to Pan et al. 61 with an atomic fluorescence spectrometer (AF-610A, Beijing Beifen-Ruili Analytical Instrument, Yangzhou, China).
Total RNA isolation and qualitative and quantitative analysis. Total  Illumina NextSeq500 PE library preparation. The RNA-Seq paired end sequencing libraries were prepared from the QC passed RNA samples using an Illumina TruSeq Strand mRNA sample prep kit. Briefly, mRNA was enriched from the total RNA using poly-T attached magnetic beads, followed by enzymatic fragmentation, 1st strand cDNA conversion using SuperScript II and Act-D mix to facilitate RNA dependent synthesis. The 1st strand cDNA was then synthesized to the second strand using a second strand mix. The double cDNA was then purified using AMPure XP beads followed by A-tailing, and adapter ligation and then enriched by limiting the number of PCR cycles.

Quantity and quality check (QC) of library on Agilent 4200 Tape Station. The PCR enriched
libraries were analysed on a 4200 Tape Station system (Agilent Technologies) using high sensitivity D1000 Screen Tape as per the manufacturer's instructions.
Cluster generation and sequencing. After we obtained the Qubit concentration for the libraries and the mean peak sizes from the Agilent Tape Station profile, the PE Illumina libraries were loaded onto NextSeq500 for cluster generation and sequencing. Paired-End sequencing allows the template fragments to be sequenced in both the forward and reverse directions on NextSeq500. The kit reagents were used in the binding of samples to complementary adapter oligos on paired-end flow cells. The adapters were designed to allow selective cleavage of the forward strands after resynthesis of the reverse strand during sequencing. The copied reverse strand was used to sequence from the opposite end of the fragment.
High quality read statistics. The sequenced raw data of the control and selenium-deficient groups were processed to obtain high quality clean reads using Trimmomatic version 0.38 to remove adapter sequences, ambiguous reads (reads with unknown nucleotides "N" > 5%), and low-quality sequences (reads with more than 10% quality threshold (QV) less than 20 phred score). Furthermore, a minimum length of 100 nt (nucleotide) after trimming was added. High-quality reads were obtained after removing the adapter and low-quality sequences from the raw data. This high quality (QV less than 20), paired-end reads were used for reference _ based read mapping. The parameters considered for filtration are as follows: (1)  Reference genome information. The  Heatmap. An average linkage hierarchical cluster analysis was performed on the top 50 differentially expressed genes (DEGs), of the control versus selenium-deficient group combination using a multiple experiments viewer (MeV v4.9.0) 64 . The heatmap shows the level of gene abundance. Levels of expression are indicated as the log 2 ratio of gene abundance between control and selenium-deficient samples. Differentially expressed genes (DEGs) were analysed by hierarchical clustering. In this regard, heatmaps were constructed using the log transformed and normalized value of genes according the Pearson uncentred distance and average linkage method. In the heatmaps, each horizontal line represents a gene. The colour denotes the logarithmic intensity of the expressed genes; relatively high expression values are shown in red colour and lower expression is shown in green.
Scatter plot. The Eurofins proprietary R script was used to graphically depict the expression of genes in two distinct conditions of each sample combination i.e., control and selenium-deficient groups. It helps to identify genes that are differentially expressed in one sample with respect to another and allows comparing two values associated with genes. In the scatter plot, each dot denotes a gene. The vertical position of each gene represents its expression level in the control samples while the horizontal position represents its expression level in the selenium-deficient group. Moreover, genes that fall above the diagonal are indicated to be overexpressed. Conversely, genes that fall below the diagonal are denoted as underexpressed as compared to their median expression level in the experimental grouping of the experiment.
Volcano plot. The Eurofins proprietary R script was used to depict the graphical representation and distribution of differentially expressed genes that were found in control as well as selenium-deficient diet groups. The 'volcano plot' displayed expressed genes along dimensions of biological as well as statistical significance. The red colour block on the right side of zero represents the upregulated genes whereas the green colour block on the left side of zero represents significant downregulated genes. The Y-axis denotes the negative log of the p value of the performed statistical test where data points with low p values are represented as highly significant and appear towards the top of the plot. The grey colour block shows the nondifferentially expressed genes.
Gene Ontology (GO) and pathway analysis. Gene Ontology was performed by using g: Profile, a public web server (http:// biit. cs. ut. ee/ gprofi ler/), and the parameters for the enrichment analysis were as follows. A specific organism was chosen "Gallus gallus". GO analyses (GO: BP (biological process); GO: CC (cellular component); GO: MF (molecular function) 19 . The functional annotations of genes were used against the curated KEGG (Kyoto Encyclopedia of Genes and Genomes) GENES database using KAAS (KEGG Automatic Annotation Server-. (http:// www. genome. jp/ kegg/ ko. html) 65 . The KEGG Orthology database of the "Gallus" family was used as the reference for pathway mapping. The result contains KEGG Orthology (KO) assignments and automatically generated KEGG pathways using the KAAS BBH bidirectional best hit (BBH) method against the available database.
Protein _ protein interaction (PPI) network construction and module analysis. The STRING (Search Tool for the Retrieval of interacting Genes/Proteins) database (https:// string-db. org/) is one of the largest databases that searches for protein _ protein interactions 66 . We selected "Multiple proteins" in the left column and entered gene names in the right column, and then picked the organism as "Gallus gallus", chose the minimum-required interaction score as "medium confidence as 0.400" and fixed disconnected nodes in the network, clicked the "Export" option, downloaded the file in TSV format (Supplementary Table S21) and imported it into Cytoscape software (version 3.9.1; https:// cytos cape. org/) which is a very powerful tool for visualizing network data 67 . Then, a plug-in in Cytoscape MCODE (molecular complex detection) version 1.5.1 68 was used to cluster the protein network to build functional modules. The default parameters were degree cut-off = 2, node density cut-off = 0.1, node score cut-off = 0.2, K-core = 2 and max. depth = 100.
Identification of enrichment analysis of hub genes. The hub genes of the PPI network of DEGs were evaluated by using the cytoHubba (version 1.6) plugin in CytoScape (version 3.9.1) to identify the top 10 hub genes according to the degree algorithm 69 . Enrichment analysis of hub genes by the CancerGeneNet database is a freely available resource with graphical representations of directed interactions between proteins (https:// signor. uniro ma2. it/ Cance rGene Net/) 70 Table S12, S13, S14, S15, S16, S17, S18, S19 & S20 www.nature.com/scientificreports/